Multi-branch feature learning based speech emotion recognition using SCAR-NET

Authors

Abstract

Speech emotion recognition (SER) is an active research area in affective computing. Recognizing emotions from speech signals helps to assess human behaviour, which has promising applications in the area of human-computer interaction. The performance of deep learning-based SER methods relies heavily on feature learning. In this paper, we propose SCAR-NET, an improved convolutional neural network, to extract emotional features and implement classification. This work includes two main parts. First, spectral, temporal, and spectral-temporal correlation features are extracted through three parallel paths. Then, split-convolve-aggregate residual blocks are designed for multi-branch feature learning. The features are refined by global average pooling (GAP) and pass through a softmax classifier to generate predictions for the different emotions. We also conduct a series of experiments to evaluate the robustness and effectiveness of SCAR-NET, which achieves 96.45%, 83.13%, and 89.93% accuracy on the datasets EMO-DB, SAVEE, and RAVDESS, respectively. These results show that SCAR-NET outperforms the compared methods.
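
The final stage described in the abstract (global average pooling followed by a softmax classifier) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the single linear layer between pooling and softmax, the toy feature maps, and the weights are all assumptions chosen for illustration only.

```python
import math


def global_average_pool(feature_maps):
    """Collapse each channel (a 2-D list: frequency x time) to its mean value."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_maps, weights, biases):
    """GAP -> linear layer -> softmax; weights[k][c] maps channel c to class k."""
    pooled = global_average_pool(feature_maps)
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Toy input (assumed): 2 channels of 2x2 features, 3 hypothetical emotion classes.
feature_maps = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
weights = [[0.5, -1.0], [0.0, 1.0], [-0.5, 0.5]]  # illustrative values, not learned
biases = [0.0, 0.0, 0.0]
probs = classify(feature_maps, weights, biases)
```

Because GAP reduces each channel to a single number, the classifier input length equals the channel count regardless of the utterance duration, which is one common reason to use GAP before a softmax layer.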

Similar resources

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Feature Transfer Learning for Speech Emotion Recognition

Speech Emotion Recognition (SER) has achieved some substantial progress in the past few decades since the dawn of emotion and speech research. In many aspects, various research efforts have been made in an attempt to achieve human-like emotion recognition performance in real-life settings. However, with the availability of speech data obtained from different devices and varied acquisition condi...

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

Emotion Recognition from Speech Using IG-Based Feature Compensation

This paper presents an approach to feature compensation for emotion recognition from speech signals. In this approach, the intonation groups (IGs) of the input speech signals are extracted first. The speech features in each selected intonation group are then extracted. With the assumption of linear mapping between feature spaces in different emotional states, a feature compensation approach is ...

Learning Corpus-Invariant Discriminant Feature Representations for Speech Emotion Recognition

As a hot topic of speech signal processing, speech emotion recognition methods have been developed rapidly in recent years. Some satisfactory results have been achieved. However, it should be noted that most of these methods are trained and evaluated on the same corpus. In reality, the training data and testing data are often collected from different corpora, and the feature distributions of di...

Journal

Journal title: Connection Science

Year: 2023

ISSN: 0954-0091, 1360-0494

DOI: https://doi.org/10.1080/09540091.2023.2189217